144-2008: In Search for a Golden Algorithm
Abstract
This paper describes the application of data mining methods to develop an HIV case-finding algorithm with SAS Enterprise Miner (EM) 5.2. Access to HIV care depends on accurate identification of all infected persons. The Veterans Health Administration (VHA) provides care to ~20,000 HIV-infected veterans. The current algorithm for identifying patients for the Registry is based only on HIV-specific diagnostic codes. We built logistic regression (LR), decision tree (DT), and neural network (NN) models to predict a binary outcome variable, HIV status. We applied these models to the VHA population to identify patients with a high predicted probability of disease. In addition to the diagnostic codes, we used demographic, geographic, laboratory, pharmacy, and service utilization variables. False negative (FN) rates and area under the curve (AUC) indices were used for model comparison. Our best models outperformed the reference model (RM) in terms of both lower FN rate and higher AUC index. The lowest FN rate (0.010% vs. 0.016% for the RM) was demonstrated by the NN model, while the highest AUC index was observed for the LR model (0.995 [0.994, 0.996] vs. 0.974 [0.971, 0.977] for the RM). Non-HIV-specific variables selected by our models included age, race/ethnicity, marital status, service-connected disability, number of days hospitalized, number of primary care and social work visits, number of total and lipid lab tests, blood pressure, and liver co-morbidities. Beyond the patients already on the Registry, the new algorithms identified an additional 5% of cases. With EM, our approach can be applied to other disease registries where electronic clinical data are available.
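The two comparison metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SAS Enterprise Miner workflow: the function names, the 0.5 threshold, and the toy data are assumptions made for illustration. The abstract also does not state which denominator its FN rate uses, so the sketch assumes the common definition FN / (FN + TP).

```python
from typing import Sequence


def false_negative_rate(y_true: Sequence[int],
                        scores: Sequence[float],
                        threshold: float = 0.5) -> float:
    """Share of true positives scored below the threshold (assumed definition)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    if not positives:
        raise ValueError("need at least one positive case")
    missed = sum(1 for s in positives if s < threshold)
    return missed / len(positives)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = HIV-positive, 0 = HIV-negative.
labels = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]

fn_rate = false_negative_rate(labels, predicted)
area = auc(labels, predicted)
```

A lower FN rate means fewer infected patients are missed at the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why a model can lead on one metric and not the other, as the NN and LR models do in the abstract.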
Related resources
Comparison of golden section search method and imperialist competitive algorithm for optimization cut-off grade- case study: Mine No. 1 of Golgohar
Optimization of the exploitation operation is one of the most important issues facing the mining engineers. Since several technical and economic parameters depend on the cut-off grade, optimization of this parameter is of particular importance. The aim of this optimization is to maximize the net present value (NPV). Since the objective function of this problem is non-linear, three methods can b...
Iterated Local Search Algorithm for the Constrained Two-Dimensional Non-Guillotine Cutting Problem
An Iterated Local Search method for the constrained two-dimensional non-guillotine cutting problem is presented. This problem consists in cutting pieces from a large stock rectangle to maximize the total value of pieces cut. In this problem, we take into account restrictions on the number of pieces of each size required to be cut. It can be classified as 2D-SLOPP (two dimensional single large o...
Generalized Cyclic Open Shop Scheduling and a Hybrid Algorithm
In this paper, we first introduce a generalized version of open shop scheduling (OSS), called generalized cyclic open shop scheduling (GCOSS), and then develop a hybrid metaheuristic method to solve this problem. Open shop scheduling is concerned with processing n jobs on m machines, where each job has exactly m operations and operation i of each job has to be processed on machine i. However...
Improved Cuckoo Search Algorithm for Global Optimization
The cuckoo search algorithm is a recently developed meta-heuristic optimization algorithm, which is suitable for solving optimization problems. To enhance the accuracy and convergence rate of this algorithm, an improved cuckoo search algorithm is proposed in this paper. Normally, the parameters of the cuckoo search are kept constant. This may lead to decreasing the efficiency of the algorithm. To cop...
Adaptive search area for fast motion estimation
In this paper, a new method is suggested for determining the search area for a motion estimation algorithm based on block matching. In the proposed method, the search area is adaptively found for each block of a frame. This search area is similar to that of the full search (FS) algorithm but smaller for most blocks of a frame. Therefore, the proposed algorithm is analogous to FS in terms of reg...
A Cuckoo search algorithm (CSA) for Precedence Constrained Sequencing Problem (PCSP)
The precedence constrained sequencing problem (PCSP) is concerned with locating the optimal sequence with the shortest traveling time among all feasible sequences. In PCSP, precedence relations determine the order of travel between any two nodes. Various methods and algorithms for effectively solving the PCSP have been suggested. In this paper we propose a cuckoo search algorithm (CSA) for effectively ...